Exit quiescence when splice_init and tx_init_rbf are rejected#4495
Exit quiescence when splice_init and tx_init_rbf are rejected#4495jkczyz wants to merge 5 commits intolightningdevkit:mainfrom
splice_init and tx_init_rbf are rejected#4495Conversation
|
👋 Thanks for assigning @wpaulino as a reviewer! |
splice_init and tx_init_rbf are rejected
a5d670c to
57aad34
Compare
8cdc480 to
959b553
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4495 +/- ##
==========================================
- Coverage 86.24% 86.22% -0.02%
==========================================
Files 160 161 +1
Lines 107909 108651 +742
Branches 107909 108651 +742
==========================================
+ Hits 93061 93681 +620
- Misses 12212 12339 +127
+ Partials 2636 2631 -5
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
959b553 to
69a1a1c
Compare
| fn quiescent_negotiation_err(&mut self, err: ChannelError) -> InteractiveTxMsgError { | ||
| let exited_quiescence = | ||
| if matches!(err, ChannelError::Abort(_)) { self.exit_quiescence() } else { false }; | ||
| InteractiveTxMsgError { err, splice_funding_failed: None, exited_quiescence } | ||
| } |
There was a problem hiding this comment.
In splice_init, resolve_queued_contribution (which can return Abort(FeeRateTooHigh)) is called before validate_splice_init checks is_quiescent(). If a misbehaving peer sends splice_init during the STFU handshake (before quiescence is established), and the holder has a queued contribution with a low max_feerate, the Abort error would flow into quiescent_negotiation_err → exit_quiescence(), which has debug_assert!(!is_local_stfu_sent()) / debug_assert!(!is_remote_stfu_sent()).
Those asserts would fire because the STFU flags are still set (handshake incomplete). In release builds, exit_quiescence would harmlessly return false, but in debug/test builds this would panic.
Same issue applies to tx_init_rbf at line 12762 (and line 12780).
Consider guarding with an is_quiescent() check before calling exit_quiescence:
| fn quiescent_negotiation_err(&mut self, err: ChannelError) -> InteractiveTxMsgError { | |
| let exited_quiescence = | |
| if matches!(err, ChannelError::Abort(_)) { self.exit_quiescence() } else { false }; | |
| InteractiveTxMsgError { err, splice_funding_failed: None, exited_quiescence } | |
| } | |
| fn quiescent_negotiation_err(&mut self, err: ChannelError) -> InteractiveTxMsgError { | |
| let exited_quiescence = if matches!(err, ChannelError::Abort(_)) | |
| && self.context.channel_state.is_quiescent() | |
| { | |
| self.exit_quiescence() | |
| } else { | |
| false | |
| }; | |
| InteractiveTxMsgError { err, splice_funding_failed: None, exited_quiescence } | |
| } |
There was a problem hiding this comment.
Fixed, though I took the opportunity to have validate_splice_init and validate_ tx_init_rbf no longer call validate_splice_contributions and FundingScope::for_splice. Rather they are called at those call sites. This let's us check if we are quiescent before checking fee rates when calling resolve_queued_contribution.
|
I've completed a thorough review of the entire PR diff, examining every file and hunk for bugs, security issues, and logic errors. Verification performed:
No issues found. |
There was a problem hiding this comment.
Note that we'll now abort an in-progress RBF if the counterparty misbehaves by sending us tx_init_rbf. The spec isn't really clear on what to do here. There's a similar case in validate_splice_init where we'll disconnect if we already have a pending splice (even if negotiation is still in progress). The spec does indicate this should be done. So maybe both of these are fine?
When tx_init_rbf is rejected with ChannelError::Abort (e.g., insufficient RBF feerate, negotiation in progress, feerate too high), the error is converted to a tx_abort message but quiescence is never exited and holding cells are never freed. This leaves the channel stuck in a quiescent state. Fix this by intercepting ChannelError::Abort before try_channel_entry! in internal_tx_init_rbf, calling exit_quiescence on the channel, and returning the error with exited_quiescence set so that handle_error frees holding cells. Also make exit_quiescence available in non-test builds by removing its cfg gate. Update tests to use the proper RBF initiation flow (with tampered feerates) so that handle_tx_abort correctly echoes the abort and exits quiescence, rather than manually crafting tx_init_rbf messages that leave node 0 without proper negotiation state. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
d275551 to
3c55d3c
Compare
|
Rebased |
lightning/src/ln/channelmanager.rs
Outdated
| &self.logger, | ||
| ); | ||
| if let Err(ChannelError::Abort(_)) = &init_res { | ||
| funded_channel.exit_quiescence(); |
There was a problem hiding this comment.
Hm I'm not sure we want to use this since the debug assertions there are reachable, but the mark_response_received call makes me realize we may need to do this elsewhere to avoid an unintentional disconnect (maybe write a test to confirm?).
There was a problem hiding this comment.
Hm I'm not sure we want to use this since the debug assertions there are reachable,
The fixup commit makes sure that we don't return ChannelError::Abort until we know we are quiescent, so I think those asserts are unreachable from here?
but the
mark_response_receivedcall makes me realize we may need to do this elsewhere to avoid an unintentional disconnect (maybe write a test to confirm?).
Like when processing splice_init, splice_ack, tx_init_rbf, tx_ack_rbf, or interactive-tx messages?
One thing I noticed when exploring this as since we don't have a FundedChannel for v2 establishment, we use UNFUNDED_CHANNEL_AGE_LIMIT_TICKS to timeout instead. So updating sent_message_awaiting_response on the ChannelContext has no effect there. We'd effectively have two different mechanisms for timing out. Are we ok with that? Or should we try to use the UNFUNDED_CHANNEL_AGE_LIMIT_TICKS for splice funding, too?
| ); | ||
|
|
||
| // Node 1 handles the echo (no-op since it already aborted). | ||
| nodes[1].node.handle_tx_abort(node_id_0, &tx_abort_echo); |
There was a problem hiding this comment.
We should include a payment such that the outbound HTLC gets sent out after the abort as a result of freeing the holding cell.
Add a payment during quiescence in test_splice_rbf_insufficient_feerate to verify that outbound HTLCs queued in the holding cell are freed when quiescence is exited by the tx_abort exchange. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The same bug fixed in the prior commit for tx_init_rbf also exists in internal_splice_init: when splice_init triggers FeeRateTooHigh in resolve_queued_contribution, the ChannelError::Abort goes through try_channel_entry! without exiting quiescence. Apply the same fix: intercept ChannelError::Abort before try_channel_entry!, call exit_quiescence, and return the error with exited_quiescence set. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The prior two commits manually intercepted ChannelError::Abort in the channelmanager handlers for splice_init and tx_init_rbf to exit quiescence before returning, since the channel methods didn't signal this themselves. The interactive TX message handlers already solved this by returning InteractiveTxMsgError which bundles exited_quiescence into the error type. Apply the same pattern: change splice_init and tx_init_rbf to return InteractiveTxMsgError, adding a quiescent_negotiation_err helper on FundedChannel that exits quiescence for Abort errors and passes through other variants unchanged. Extract handle_interactive_tx_msg_err in channelmanager to deduplicate the error handling across internal_tx_msg, internal_splice_init, and internal_tx_init_rbf. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
validate_splice_init and validate_tx_init_rbf check preconditions including quiescence. Move them before resolve_queued_contribution so that a misbehaving peer sending splice_init or tx_init_rbf before quiescence is rejected early. This also moves validate_splice_contributions and FundingScope construction into the callers since they depend on the resolved contribution. Have validate_tx_init_rbf return the last candidate's funding pubkeys so the caller can construct FundingScope without re-accessing pending_splice. Add a debug_assert in quiescent_negotiation_err that Abort errors only occur when the channel is quiescent, since exit_quiescence requires it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3c55d3c to
e4b1c6b
Compare
Summary
tx_init_rbforsplice_initis rejected withAbort(e.g., insufficient RBF feerate, negotiation in progress), which left the channel stuck in a quiescent stateInteractiveTxMsgError, reusing the same pattern already used by the interactive TX message handlers, with a sharedhandle_interactive_tx_msg_errhelper in channelmanagerTest plan
test_splice_rbf_insufficient_feerateupdated to verify quiescence is properly exited aftertx_aborttest_splice_feerate_too_highupdated to verify quiescence is properly exited aftersplice_initrejection🤖 Generated with Claude Code
Based on #4494.